\begin{abstract}
Diagnosing performance problems in large-scale data-intensive programs is inherently difficult due to its scale and distributed nature. This paper presents a non-intrusive, effective, distributed monitoring tool to facilitate performance diagnosis on Hadoop platform, an open source implementation of the well-known MapReduce programming framework.

The tool explores data from Hadoop logs and aggregates per-job/per-node operating system level metrics on each cluster node. It provides multi-view visualization and view-zooming functionality to assist application developers in better reasoning about job execution progress and system behavior. Experimental results show that, due to the extra levels of information it reveals as well as its effective visualization scheme, our tool is able to diagnose some kinds of problems that previous tools fail to.
\end{abstract}
